89 research outputs found

Locality and Singularity for Store-Atomic Memory Models

Author: A Bouajjani
A Dan
D Shasha
E Derevenetc
J Alglave
J Burnim
J Nieplocha
L Lamport
M Kuperstein
MF Atig
P Sewell
PA Abdulla
R Machado
S Burckhardt
V Vafeiadis
Y Meshman
Publication date: 14/03/2017

Robustness is a correctness notion for concurrent programs running under relaxed consistency models. The task is to check that the relaxed behavior coincides (up to traces) with sequential consistency (SC). Although computationally simple on paper (robustness has been shown to be PSPACE-complete for TSO, PGAS, and Power), building a practical robustness checker remains a challenge. The problem is that the various relaxations lead to a dramatic number of computations, only a few of which violate robustness. In the present paper, we set out to reduce the search space for robustness checkers. We focus on store-atomic consistency models and establish two completeness results. The first result, called locality, states that a non-robust program always contains a violating computation where only one thread delays commands. The second result, called singularity, is even stronger but restricted to programs without lightweight fences. It states that there is a violating computation where a single store is delayed. As an application of the results, we derive a linear-size source-to-source translation of robustness to SC-reachability. It applies to general programs, regardless of the data domain and potentially with an unbounded number of threads and with unbounded buffers. We have implemented the translation and verified, for the first time, PGAS algorithms in a fully automated fashion. For TSO, our analysis outperforms existing tools.

arXiv.org e-Print Archive

Crossref

Fraunhofer-ePrints

A Case Study in Tightly Coupled Multi-paradigm Parallel Programming

Author: A. Ferrari
A. Gursoy
A.L. Lastovetsky
C.-C. Chiang
G. Zheng
J. Leichtl
J. Nieplocha
L.V. Kale
R. Abedi
S.-E. Choi
T. El-Ghazawi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Crossref

An efficient parallelization scheme for molecular dynamics simulations with many-body, flexible, polarizable empirical potentials: application to water

Author: A Rahman
B Quentrec
BR Brooks
BT Thole
CJ Burnham
DA Pearlman
DE Shaw
George S. Fanourgakis
GS Fanourgakis
GS Heffelfinger
H Partridge
J Nieplocha
JA Barker
Jarek Nieplocha
JC Grossman
JR Reimers
L Verlet
LX Dang
M Allesch
M Snir
MH Willebeek-LeMair
MP Allen
MW Mahoney
P Ahlström
S Nosé
S Plimpton
Sotiris S. Xantheas
TM Nymand
U Essmann
Vinod Tipparaju
W Smith
WC Swope
WH Press
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A Component Architecture for High-Performance Scientific Computing

Author: Allan B. A.
Armstrong R.
Bernholdt D. E.
Bertrand F.
Chiu K.
Dahlgren T. L.
Damevski K.
Elwasif W. R.
Epperly T. W.
Govindaraju M.
Katz D. S.
Kohl J. A.
Krishnan M.
Kumfert G.
Larson J. W.
Lefantzi S.
Lewis M. J.
Malony A. D.
McInnes L. C.
Nieplocha J.
Norris B.
Parker S. G.
Ray J.
Shende S.
Windus T. L.
Zhou S.
Publication venue: Lawrence Livermore National Laboratory
Publication date: 14/12/2004

The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.

UNT Digital Library

Automatic code generation for many-body electronic structure methods: the tensor contraction engine

Author: Aho A
Alexander A. Auer
Alexander Sibiryakov
Alina Bibireata
Chi-Chung Lam
Daniel Cociorva
David E. Bernholdt
Garey M
Gerald Baumgartner
J. Ramanujam
Janssen C
Janssen CL
Jones C
Lindgren I
Marcel Nooijen
Monkhorst HJ
Mukherjee D
Nieplocha J
P. Sadayappan
Qingda Lu
Robert Harrison
Russell Pitzer
Sandhya Krishnan
Sriram Krishnamoorthy
Straatsma TP
Venkatesh Choppella
Werner H-J
Xiaoyang Gao
Yanai T
Publication venue: 'Informa UK Limited'

Crossref

A Novel Approach to Parallel Coupled Cluster Calculations: Combining Distributed and Shared Memory Techniques for Modern Cluster Based Systems

Crossref

Reactive transport codes for subsurface environmental simulation

Author: A Chilakapati
A Gräbe
A Navarre-Sitchler
A Verma
AC Lasaga
AK Navarre-Sitchler
AK Singh
AMM Leal
AP Vanselow
B Berkowitz
B Nowack
B. Arora
BD Gibson
BE Rittmann
C Beyer
C Dalkhaa
C Liu
C Monnin
C Neuzil
C Tournassat
C Wanner
C-H Park
C. A. J. Appelo
C. I. Steefel
CAJ Appelo
CE Harvie
CI Steefel
D Dabo
D Jacques
D. Jacques
D. Moulton
DA Dzombak
DA Kulik
DB Kent
DC Thorstenson
DJ Goode
DJ Kirkner
DT Snow
EL Sonnenthal
EM Thaysen
ER Giambalvo
ESP Aradóttir
F Centler
F Gérard
F Van Zeggeren
G Sposito
G. T. Yeh
GE Hammond
GI Barenblatt
GJ Hooyman
GR Miller
GT Yeh
H Prommer
H Shao
H. Shao
HA Van der Sloot
HC Slider
I Wallis
I-S Liu
J Nieplocha
J Rutqvist
J van der Lee
J Šimůnek
J-O Delfs
J. C. L. Meeussen
J. Šimůnek
JA Davis
JA Meima
JC Meeussen
JD Filius
JG Farmer
JL Druhan
JP Gwo
JPM Vink
JS Geelhoed
K Maher
K Pruess
K Rink
K. U. Mayer
KG Zuurbier
KS Pitzer
KT MacQuarrie
KU Mayer
L Cheng
L De Windt
L Li
L Luckner
L Weng
L Zheng
LM de Vries
M Debure
M Xie
MA Celia
MD White
MH Reed
MJ Cheadle
MT Van Genuchten
MW Saaltink
MZ Wu
N Böttcher
N Jacquemet
N Spycher
N Watanabe
N. Spycher
NCM Marty
O Kolditz
O. Kolditz
P Aagaard
P Audigane
P. C. Lichtner
PC Lichtner
PF Dobson
Q Jin
R Aris
RJ Millington
RM Bowen
RT Amos
S Bea
S Finsterle
S Kräutle
S Molins
S Mukhopadhyay
S Panday
S Sarkar
S. B. Yabusaki
S. Molins
SA Bea
SP Neuman
SR Charlton
T Henderson
T Kalbacher
T Nagel
T Wolery
T Xu
T. Kalbacher
V Lagneau
V. Lagneau
W Wang
WH Chiang
WH Van Riemsdijk
Y Fang
YL Fang
YS Wu
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Noncollective Communicator Creation in MPI

Author: G.K. Schenter
J. Nieplocha
M. Kamiya
R.L. Graham
T.L. Windus
W.D. Gropp
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Crossref

Shared Memory NUMA Programming on I-WAY

Author: J. Nieplocha
Nieplocha And Harrison
R. J. Harrison
Publication venue: IEEE Computer Society Press

The performance of the Global Array shared-memory nonuniform memory-access programming model is explored on the I-WAY wide-area-network distributed supercomputer environment. The Global Array model is extended by introducing the concept of mirrored arrays. Latencies and bandwidths for remote memory access are studied, and the performance of a large application from computational chemistry is evaluated using both fully distributed and mirrored arrays. Excellent performance can be obtained with mirroring if even modest (0.5 MB/s) network bandwidth is available.

CiteSeerX